Through his various examples, Professor Efron
makes a convincing case that cutting-edge science 
requires methods for detecting multiple "non-nulls." 
These methods must be straightforward to imple- 
ment, but perhaps more importantly statisticians 
need to be able to justify them unambiguously. 
Efron's Empirical Bayes approach is certainly com- 
putationally efficient, but we feel the rationale for 
making each of his steps is unattractively ad hoc. 
This concern is practical, not philosophical; Efron's 
criterion for choice of tuning parameters seems to 
be that they look "believable." In less expert hands, 
this approach seems to introduce a lot of leeway for 
practitioners to simply "tune" away until they get 
the results they want. 

In an attempt to address this problem, we will 
describe an approach developed in a fully model- 
based framework. As with locfdr, the calculations 
are fast, but our whole analysis derives from clear 
up-front statements about what the analysis is try- 
ing to achieve, and the modeling assumptions made. 
The results look reassuringly similar to Professor 
Efron's. We hope this will be helpful for understanding
the current paper, and that it makes a contribution
to this general field.

We begin by following Efron in placing the local 
false discovery rate, fdr(z), as the primary focus of
the analysis, and exploit the fact that it can offer a 
neat parameterization of the two-part model. If the
marginal, "mixture" density for the z-values is

$$f(z) = p_0 f_0(z) + (1 - p_0) f_1(z)$$

and $\mathrm{fdr}(z) = p_0 f_0(z)/f(z)$, then

$$f_1(z) = \frac{p_0}{1 - p_0}\,\frac{1 - \mathrm{fdr}(z)}{\mathrm{fdr}(z)}\,f_0(z).$$



We observe that, because $f_1$ is a density, we only
need to know $f_0$ and fdr in order to find its
normalized form, and in turn this tells us the value of $p_0$.
Thus, for a given $f_0$, specifying fdr sets up
everything else we require for model-based analysis.
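
The normalization argument above can be sketched numerically. The following is a minimal illustration, not part of our software: the function name `null_proportion`, the integration limits and the grid size are all assumptions made for the example. It integrates (1 - fdr)/fdr times $f_0$ by the trapezoid rule to obtain the odds $(1 - p_0)/p_0$, and hence $p_0$.

```python
import math


def std_normal_pdf(z):
    """Standard Normal density, used here as the null component f0."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def null_proportion(fdr, f0=std_normal_pdf, lo=-12.0, hi=12.0, n=24000):
    """Recover p0 from fdr and f0, using the fact that f1 integrates to 1.

    Since f1(z) = [p0/(1-p0)] * [(1-fdr(z))/fdr(z)] * f0(z), integrating
    both sides gives (1-p0)/p0 = integral of (1-fdr)/fdr * f0.
    The integral is approximated by the trapezoid rule on [lo, hi];
    the limits and grid size are illustrative choices.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        r = fdr(z)
        total += weight * (1.0 - r) / r * f0(z)
    nonnull_odds = total * h          # equals (1 - p0) / p0
    return 1.0 / (1.0 + nonnull_odds)


if __name__ == "__main__":
    # Hypothetical fdr, logistic-linear in |z| (coefficients are made up).
    def example_fdr(z):
        return 1.0 / (1.0 + math.exp(2.0 * abs(z) - 6.0))

    print(null_proportion(example_fdr))
```

Any fdr that is strictly positive on the integration range can be passed in; an fdr that reaches exactly zero would need separate handling.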

Naturally, the analysis we report will depend on 
the functional form assumed for fdr, and Efron im- 
plicitly assumes a rather flexible form of fdr, through 
a seventh-order polynomial-smoothed density esti- 
mate. However, this approach does not rule out an 
fdr with multiple peaks. Thinking of the schools
example, we would not want to be the statistician
explaining how two "bad" schools may have $z_1 < z_2 < 0$,
and yet fdr($z_1$) > 0.2 while fdr($z_2$) < 0.2. Put more
simply, Efron's method can report that School 1 has
worse performance, but only School 2 is called an
outlier. We find it more straightforward to a priori
justify our choice of fdr by careful consideration of 
its role in the reported inference. 

In our experience, the search for non-null "discoveries"
is based around two ideas. First, we will not
discover anything near the center of $f_0$ (effectively
Efron's "zero assumption," also termed "purity" by
Genovese and Wasserman, 2004). A second sensible
assumption is that the evidence for z being "null"
will decrease monotonically as we move out from
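the center. One way to satisfy this is with a
logistic-linear form for fdr, giving a two-component normal
mixture for $f_1$, but we get closer to the spirit of
Efron's analysis by assuming that fdr is unity
inside a central region, and then follows a half-normal
decline, that is,

$$\mathrm{fdr}^H(z) = \begin{cases} e^{-(z + k_a)^2/2}, & z \le -k_a,\\ 1, & -k_a < z < k_b,\\ e^{-(z - k_b)^2/2}, & z \ge k_b. \end{cases}$$
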
Following the observation above, taking the null
component $f_0$ to be standard Normal now defines the
following marginal distribution $f^H(z)$:

$$f^H(z) = p_0 (2\pi)^{-1/2} \begin{cases} e^{-k_a|z| + k_a^2/2}, & z \le -k_a,\\ e^{-z^2/2}, & -k_a < z < k_b,\\ e^{-k_b|z| + k_b^2/2}, & z \ge k_b, \end{cases}$$

where the constant of proportionality is $p_0$, the
proportion of nulls, which is an easily determined
function of $k_a$ and $k_b$.
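
As a small sketch of these two functions (the names `huber_kernel` and `fdr_huber` are hypothetical and are not taken from our R or WinBUGS code), the marginal density up to the factor $p_0$ and the corresponding fdr can be written directly from their piecewise definitions. The ratio of the standard Normal density to the kernel reproduces the fdr, which is a convenient consistency check.

```python
import math


def huber_kernel(z, ka, kb):
    """f^H(z) / p0: standard Normal core on (-ka, kb), exponential tails outside."""
    if z <= -ka:
        exponent = -ka * abs(z) + 0.5 * ka * ka
    elif z >= kb:
        exponent = -kb * abs(z) + 0.5 * kb * kb
    else:
        exponent = -0.5 * z * z
    return math.exp(exponent) / math.sqrt(2.0 * math.pi)


def fdr_huber(z, ka, kb):
    """Local fdr: unity in the core, half-normal decline beyond each threshold."""
    if z <= -ka:
        return math.exp(-0.5 * (z + ka) ** 2)
    if z >= kb:
        return math.exp(-0.5 * (z - kb) ** 2)
    return 1.0


if __name__ == "__main__":
    ka, kb = 1.8, 1.75  # illustrative thresholds
    for z in (-3.0, 0.0, 2.5):
        phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        # The two printed values agree: fdr = p0*f0/f^H = f0/kernel.
        print(z, phi / huber_kernel(z, ka, kb), fdr_huber(z, ka, kb))
```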

$f^H(z)$ is seen to have a $N(0, 1)$ "core" and
exponential tails. By substituting $(z - \mu_0)/\sigma_0$ for z
in $f_0(z)$ and $\mathrm{fdr}^H(z)$, it is easily generalized to a
full location-scale family, where the "core" (or null
distribution) is now $N(\mu_0, \sigma_0^2)$. We term this a
"Huber" distribution, denoted $H(\mu_0, \sigma_0, k_a, k_b)$,
following the observation in Huber (1964) that his
optimal robust location estimation procedure based on a
piecewise-linear bounded influence function was
precisely equivalent to maximum likelihood estimation
applied to such a distribution, but with $k_a = k_b = k$
specified and $\sigma_0$ assumed known.

Assuming this distribution and adopting a full
likelihood approach, maximum likelihood estimates
$\hat\mu_0, \hat\sigma_0$ are the solutions of estimating equations that
take, up to a very good approximation, the same
form as Huber's famous "Type 2" estimator. We do
not need to fix $k_a$ and $k_b$; they can be estimated
from the data in the same way.
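
To illustrate the form of these estimating equations, the following is a deliberately simplified sketch: it solves only the location equation, with the scale and both thresholds held fixed, whereas the analysis described here estimates all of them. The function names and the fixed-point iteration are assumptions of the example and do not describe our package. The clipped residual `psi` is the negative derivative of the log Huber density with respect to the standardized residual.

```python
def psi(u, ka, kb):
    """Bounded influence function: u in the core, clipped at -ka and kb."""
    return max(-ka, min(kb, u))


def huber_location(zs, sigma, ka, kb, tol=1e-10, max_iter=500):
    """Solve sum(psi((z - mu) / sigma)) = 0 for mu, with sigma, ka, kb fixed.

    Uses the simple update mu <- mu + sigma * mean(psi), started at the
    (upper) median; the update shrinks towards the root because at most
    all of the residuals lie in the core.
    """
    ordered = sorted(zs)
    mu = ordered[len(ordered) // 2]
    for _ in range(max_iter):
        step = sigma * sum(psi((z - mu) / sigma, ka, kb) for z in zs) / len(zs)
        mu += step
        if abs(step) < tol:
            break
    return mu


if __name__ == "__main__":
    # Made-up data: five values near zero and one gross outlier.
    data = [-0.3, 0.1, 0.4, -0.2, 0.0, 8.0]
    print(huber_location(data, sigma=1.0, ka=1.5, kb=1.5))
    print(sum(data) / len(data))  # the ordinary mean, for comparison
```

In this toy dataset the outlier contributes only its clipped value to the estimating equation, so the estimate stays close to the bulk of the data while the ordinary mean does not.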

We have implemented maximum likelihood-based
regression for this error distribution within our own
R package (huber.lm), and also a fully Bayesian
MCMC approach via a new distribution, dhuber,
within WinBUGS.

Figure 1 and Table 1 show the results of fitting 
this distributional family to four of Efron's examples 
using huber.lm.

In line with Efron, we assume that $f_0$ follows a
$N(\mu_0, \sigma_0^2)$ distribution, and provide point estimates
for $\mu_0, \sigma_0, p_0$ as well as $k_a, k_b$. We also show the
fitted marginal distributions $f^H(z)$, QQ-plots of the
z-values against $f^H(z)$ and a "naive" Normal, the
fitted local false discovery rate $\mathrm{fdr}^H(z)$, and an
appropriately scaled representation of the "alternative"
distribution $f_1$. Figure 1 shows a good fit of the
Huber distribution to these examples. The fitted $\mathrm{fdr}^H$
curves are also plotted, and these show a close
concordance with Efron's locfdr results. For the BRCA
data, we have not plotted $\mathrm{fdr}^H$, as use of the Huber
distribution here gives estimates for both $k_a$ and $k_b$
tending to $\infty$, and hence gives a point estimate of
fdr = 1 for all data points. The practical message is
clear: we find that the BRCA data, on its own,
provides no strong evidence of any signals beyond the
fitted $N(\mu, \sigma^2)$ null, in line with Efron's results. The
QQ-plot for the BRCA data provides further
informal confirmation. Other authors have declared some
evidence for signals in this dataset, a recent example
being Jin and Cai (2007). However, this is in
contrast to a Bayesian analysis with a uniform prior for
$k_a$ and $k_b$, which leads to a posterior for both $k_a$ and
$k_b$ that rules out values less than 2 ($1 - p_0 > 0.8\%$) and
which provides an essentially uniform distribution
for $k_a, k_b > 3$ ($1 - p_0 < 0.02\%$).

Table 1 provides parameter estimates for the
asymmetric Huber distribution. Likelihood ratio tests of a
common k (that is, $k_a = k_b$) give p = 0.68 (Prostate),
p = 0.14 (Education) and p = 0.007 (HIV). We find a close
concordance between our results and those in Efron's paper.
The estimated proportions of non-null observations are
1.7% (Prostate), 7.3% (Education) and 6.2% (HIV).
As $p_0$ is a slightly messy function of $k_a$ and $k_b$,

$$p_0 = \sqrt{2\pi}\,\Bigl[ e^{-k_a^2/2}/k_a + e^{-k_b^2/2}/k_b + \sqrt{2\pi}\,\bigl(\Phi(k_a) + \Phi(k_b) - 1\bigr) \Bigr]^{-1},$$

we have found it easiest to obtain intervals by us- 
ing an MCMC approach. However, using the delta 
method or a parametric bootstrap on the distribu- 
tion of the MLEs offers, in spirit, the same inference. 
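
For readers who wish to evaluate this expression, a minimal sketch follows; the function name `huber_p0` is hypothetical. Plugging in the point estimates of the thresholds reported in Table 1 gives values that agree with the tabulated estimates of the null proportion to within the rounding of the reported thresholds.

```python
import math
from statistics import NormalDist


def huber_p0(ka, kb):
    """Null proportion p0 implied by the thresholds ka and kb.

    p0 = sqrt(2*pi) / [exp(-ka^2/2)/ka + exp(-kb^2/2)/kb
                       + sqrt(2*pi) * (Phi(ka) + Phi(kb) - 1)]
    """
    Phi = NormalDist().cdf
    root = math.sqrt(2.0 * math.pi)
    denom = (math.exp(-0.5 * ka * ka) / ka
             + math.exp(-0.5 * kb * kb) / kb
             + root * (Phi(ka) + Phi(kb) - 1.0))
    return root / denom


if __name__ == "__main__":
    # Threshold point estimates (ka, kb) as reported in Table 1.
    for name, ka, kb in (("Prostate", 1.80, 1.75),
                         ("Education", 1.31, 1.21),
                         ("HIV", 1.40, 1.26)):
        print(name, round(huber_p0(ka, kb), 3))
```

As both thresholds grow, the tails vanish and the value tends to 1, which matches the BRCA fit described above.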

In contrast to Efron's desire to "minimize the 
amount of statistical modeling required of the statis- 
tician," we would encourage statistical modeling where 
the modeling assumptions are clear and comprehen- 
sible; for example, we find a simply defined para- 
metric model preferable to Efron's seven-parameter 
polynomial-smoothed density estimate. Our explicit 
acknowledgment of these assumptions also motivates 
consideration (below) of how they may be usefully 
strengthened, and also whether they may be relaxed. 

Using a simple but flexible fully parametric
family such as the Huber distributions confers many
advantages. If we are willing to condition on the
adequacy of the assumed model for $f^H(z)$, then the full
resources of likelihood modeling become available,
providing interval estimates, hypothesis tests and so
on. In a hierarchical setting, the Huber distribution
can also be considered at the random-effects level.
Computationally this is handled with ease within
a full Bayesian MCMC environment, where using
$H(\mu, \sigma, k)$ or $H(\mu, \sigma, k_a, k_b)$ within a hierarchical model
presents no additional difficulties over its use as a
sampling distribution. Becoming "more" Bayesian
still, we note the possibilities for use of informative
priors regarding the thresholds $k_a$ and $k_b$, and
hence implicitly $p_0$. In our opinion, analyses which
acknowledge these a priori assumptions seem particularly
attractive for examples smaller than Efron's,
where a reliable density estimate seems out of reach.
Finally, a Bayesian modeling framework allows the
inclusion of a model for such data within an integrated
evidence synthesis, which can be guided by
a combination of substantive knowledge and data
analysis.

Fig. 1. Summary plots fitting the Huber distribution to four examples. For each dataset, we plot histograms of the z-values
and fitted marginal distribution, QQ-plots of the data against fitted Huber distribution ($f^H$) and a naive pure Normal ($f^N$),
and finally a plot of the fitted fdr and the alternative distribution $f_1$ (inverted). For BRCA, the fitted fdr is always 1, giving
no strong evidence of signals in this dataset.

Table 1
Maximum likelihood estimates (95% intervals) for parameters of the asymmetric Huber distribution for four of Efron's
examples; the intervals for $p_0$ are obtained from an MCMC simulation

           | Prostate               | Education               | BRCA                   | HIV
-----------+------------------------+-------------------------+------------------------+-------------------------
$\mu_0$    | -0.001 (-0.031, 0.030) | -0.361 (-0.427, -0.295) | -0.026 (-0.075, 0.023) | -0.138 (-0.161, -0.115)
$\sigma_0$ | 1.059 (1.030, 1.089)   | 1.452 (1.363, 1.546)    | 1.431 (1.396, 1.466)   | 0.760 (0.730, 0.791)
$k_a$      | 1.80 (1.61, 2.01)      | 1.31 (1.17, 1.48)       |                        | 1.40 (1.28, 1.53)
$k_b$      | 1.75 (1.59, 1.93)      | 1.21 (1.08, 1.37)       |                        | 1.26 (1.17, 1.36)
$p_0$      | 0.983 (0.975, 0.990)   | 0.927 (0.899, 0.950)    |                        | 0.938 (0.921, 0.954)

Taking a less Bayesian or full-likelihood approach,
and not wishing to condition on the "truth" of the
model assumptions, one could proceed directly to
Huber-style estimating equations for $\mu_0$, $\sigma_0$ and $k$
(or $k_a$ and $k_b$), justified either through their
connection to the model we have described, or by
arguing that this influence function directly reflects
the population parameter we want to estimate. If we
are trying to minimize model-dependence, the
second approach is more satisfactory, and is quite
standard in GEE. Sandwich and/or bootstrap variance
estimates could be used to reflect uncertainty about
these point estimates, without further parametric
assumptions about the mixture distribution $f$. In
samples of thousands of z's (but not with a few
hundred), this provides appealingly robust estimates of
location and scale.
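
As an indication of what a sandwich variance estimate looks like in the simplest case, the sketch below treats only the location parameter, with the scale and thresholds regarded as fixed and known. This is a simplifying assumption of the example, and the function names are hypothetical. The "meat" is the sum of squared clipped residuals, and the "bread" is the sum of the derivatives of the influence function, which is simply the number of observations falling in the core.

```python
def psi(u, ka, kb):
    """Bounded influence function: u in the core, clipped at -ka and kb."""
    return max(-ka, min(kb, u))


def sandwich_variance(zs, mu, sigma, ka, kb):
    """Sandwich variance for the location estimate solving sum(psi) = 0.

    Var(mu_hat) is approximated by sigma^2 * sum(psi(u)^2) / (sum(psi'(u)))^2,
    where u = (z - mu) / sigma and psi'(u) is 1 inside the core, 0 outside.
    """
    us = [(z - mu) / sigma for z in zs]
    meat = sum(psi(u, ka, kb) ** 2 for u in us)
    bread = sum(1.0 for u in us if -ka < u < kb)
    if bread == 0.0:
        raise ValueError("no observations in the core; variance undefined")
    return sigma ** 2 * meat / bread ** 2


if __name__ == "__main__":
    # Made-up data with one gross outlier; mu solves the estimating equation.
    data = [-0.3, 0.1, 0.4, -0.2, 0.0, 8.0]
    print(sandwich_variance(data, mu=0.3, sigma=1.0, ka=1.5, kb=1.5))
```

With very wide thresholds every observation lies in the core, and the expression reduces to the familiar sum of squared deviations divided by the squared sample size.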



However, going beyond $\mu_0$ and $\sigma_0$, it is not clear
to us that the GEE paradigm allows "model-robust"
measures of fdr. Must one compare the marginal $f$
to an $f_0$ which is assumed to have a specifically
Gaussian form, or that of some other parametric 
family? Might some advanced form of cross-validation 
offer a model-free approach? And could this be done 
without an excessive computational burden? Any in- 
sights from Professor Efron in this matter would be 
very welcome. 

In conclusion, we feel that flexible likelihood or 
Bayesian modeling techniques, combined with ba- 
sic insights from the literature on outlier-robustness, 
will contain much of value in the era of microar- 
rays and other data-sources requiring large numbers 
of hypothesis tests. We thank Professor Efron for 
his stimulating paper, and also for his generosity in 
making available the four featured datasets. 